Abstract



This article systematically reviews research on the achievement outcomes of four types of 
approaches to improving the reading success of children in the elementary grades: reading 
curricula, instructional technology, instructional process programs, and combinations of curricula 
and instructional process. To be included, studies had to use randomized or matched
control groups, last at least 12 weeks, use valid achievement measures independent of
the experimental treatments, and give a final assessment at the end of grade 1 or later. A total of 63
beginning reading (starting in K or 1) and 79 upper elementary (2-5) reading studies met these 
criteria. The review concludes that instructional process programs designed to change daily 
teaching practices have substantially greater research support than programs that focus on 
curriculum or technology alone. 

From the first day of kindergarten to the last day of elementary school, children
substantially define themselves as readers, and this has enormous influence on their development 
as learners and as members of society. Those who succeed in becoming fluent, strategic, and 
joyful readers are not guaranteed success in school or in life, but they are well on their way. 
However, those who do not succeed in reading, or who become reluctant readers, face long odds 
in achieving success in school and life. Every educator, parent, and policy maker knows the 
critical importance of reading in the elementary grades. Further, the gap in reading performance 
between different ethnic groups, and between middle class and disadvantaged children, is 
perhaps the most important policy issue in education in the U.S. Because of the obvious 
importance of success in reading, schools invest enormous sums in initial teaching of reading 
and in remedial services for struggling readers. 

Given the great importance of success in reading for millions of children and for our 
society as a whole, one would imagine that there would be a great deal of research on how 
teachers can most effectively teach children to read. There is in fact a great deal of basic research 
on reading, and we know a lot about how children learn to read and what goes wrong when they 
fail to learn (see for example National Reading Panel, 2000; Snow, Burns, & Griffin, 1998; 
National Early Literacy Panel, 2008). Yet there is much less research evaluating the practical 
programs actually available to schools and teachers to ensure reading success, and the research 
that does exist has not been comprehensively reviewed. 

It is useful, for example, to know that effective beginning reading programs emphasize 
phonemic awareness, phonics, fluency, vocabulary, and comprehension, as concluded by the 
National Reading Panel (NRP, 2000). Reviews by Adams (1990) and by Snow, Burns, & Griffin 
(1998), as well as the NRP, have supported the importance of teaching with a strong emphasis on 
phonics and phonemic awareness. Yet school leaders and teachers do not choose between
“phonics” and “no phonics.” Instead, they choose among particular textbooks, software, and 
professional development approaches. Any particular program may incorporate the five NRP 
elements to a greater or lesser degree, but each also incorporates other features (such as 
classroom organization, motivation, grouping, assessment, and professional development) that 
also determine the outcomes of the program. 

The importance of focusing attention on all aspects of reading approaches, not just on 
phonics or other NRP elements, was illustrated by the experience of the federal Reading First 
program. Based in large part on the findings of the National Reading Panel (2000) and earlier 
research syntheses, the Reading First program favored phonics and phonemic awareness, and a 
national study of Reading First by Gamse et al. (2008) and Moss et al. (2008) found that teachers 
in Reading First schools were in fact doing more phonics teaching than were those in similar 
non-Reading First schools. Yet outcomes were disappointing, with small effects seen on first 
grade decoding measures and no impact on comprehension measures in grades 1-3. Similarly, a 
large study of intensive professional development focusing on phonics found no effects on the 
reading skills of second graders (Garet et al., 2008). The findings of these large-scale 
experiments imply that while the importance of phonics and phonemic awareness in reading
instruction is well established, the addition of phonics to traditional basal instruction is not
sufficient to bring about widespread improvement in children’s reading. Other factors, especially 
relating to teaching methods, are also consequential. 

The What Works Clearinghouse (WWC, 2009), in its beginning reading topic report, 
reviewed research on reading programs evaluated in grades K through 3. However, the WWC 
only reports program ratings, and does not include discussion of the findings or draw 
generalizations about the effects of types of programs. Further, WWC inclusion standards
applied in its beginning reading topic report include very brief studies (as few as 5 hours of 
instruction), and studies that used measures of skills taught in experimental but not control 
groups. It does not weight by sample size, and many of its conclusions are based on atypical 
effect sizes from studies with sample sizes as small as 46 (see Slavin, 2008). 

The present article reviews research on the achievement outcomes of practical initial 
(non-remedial) reading programs for all elementary children, grades K-5, applying consistent 
methodological standards to the research. It is intended to provide fair summaries of the 
achievement effects of the full range of reading approaches available to educators and policy 
makers, and to summarize for researchers the current state of the art in this area. The scope of the 
review includes all types of programs that teachers, principals, or superintendents might consider 
to improve the success of their children in reading: curricula, instructional technology, 
instructional process programs, and combinations of curricula and instructional process. The 
review uses a form of best evidence synthesis (Slavin, 1986), adapted for use in reviewing “what 
works” literatures in which there are generally few studies evaluating each of many programs. It 
is part of a series, all of which used the same methods with minor adaptations. Separate 
syntheses review research on remedial, preventive, and special education programs in elementary 
reading (Slavin, Lake, Davis, & Madden, 2009), middle and high school reading programs 
(Slavin, Cheung, Groff, & Lake, 2008), and reading programs for English language learners 
(Cheung & Slavin, 2005). 

Focus of the Current Review



The present review uses procedures similar to those used in the secondary reading review 
to examine research on initial (non-remedial) programs for elementary reading. The purpose of 
the review is to place all types of initial reading programs intended to enhance reading 
achievement on a common scale, to provide educators and policy makers with meaningful, 
unbiased information that they can use to select programs most likely to make a difference with 
their students. The review emphasizes practical programs that are or could be used at scale. It 
therefore emphasizes large studies done over significant time periods that used standard 
measures, to maximize the usefulness of the review to educators. The review also seeks to 
identify common characteristics of programs likely to make a difference in reading achievement. 
This synthesis is intended to include all kinds of approaches to reading instruction and groups
them in four categories: reading curricula, instructional technology, instructional process 
programs, and combinations of reading curricula and instructional process. Reading curricula 
primarily encompass core reading textbooks and curricula, such as Reading Street and Open 
Court Reading. Instructional technology refers to programs that use technology to enhance 
reading achievement. This includes traditional supplementary computer-assisted instruction 
(CAI) programs, in which students are sent to computer labs for additional practice. Other 
instructional technology programs include Reading Reels, which provides embedded multimedia 
in daily lessons, and Writing to Read, which combines technology and non-technology small 
group activities. Instructional process programs rely primarily on professional development to 
give teachers effective strategies for teaching reading. These include programs focusing on 
cooperative learning, such as PALS and CIRC, and programs focusing on phonics and 
phonological awareness. Curriculum and instructional process programs, specifically Success 
for All and Direct Instruction, provide specific phonetic curricula as well as extensive
professional development focused on instructional strategies. Comprehensive school reform 
(CSR) programs were included only if they included specific reading programs; for a broader 
review of outcomes of elementary CSR models, see CSRQ (2006) and Borman et al. (2003). 

Methodological Issues Characteristic of Elementary Reading Research

While this review of research on reading programs shares methodological issues common 
to all systematic reviews, there are also some key issues unique to this subject and grade level. 
The thorniest of these relates to measurement. In the early stages of reading, researchers often 
use measures such as phonemic awareness that are not “reading” in any sense, though they are 
precursors. However, measures of reading comprehension and reading vocabulary tend to have 
floor effects at the kindergarten and first grade levels. The present review included measures 
such as letter-word identification and word attack, but did not accept measures such as auditory
phonemic awareness. Measures of oral vocabulary, spelling, and language arts were excluded at 
all grade levels. 

Another problem of early reading measurement is that in kindergarten, it is possible for a 
study to find positive effects of programs that introduce skills not ordinarily taught in 
kindergarten on measures of those skills. For example, until the late 1990s it was not common in
U.S. kindergartens for children to be taught phonics or phonemic awareness. Programs that 
moved these then first-grade skills into kindergarten might appear very effective in comparison 
to control classes receiving little or no instruction on those skills, but would in fact simply be 
teaching skills the control children would probably have mastered somewhat later. 

Because of the difficulty of defining and measuring early literacy skills, multi-year
evaluations of programs that may begin in kindergarten but follow children at least through the
end of first or second grade are of particular value. By the end of second grade, it is certain that
control students as well as experimental students have been seriously taught to read, and it 
becomes possible to use measures of reading comprehension and reading vocabulary that more 
fully represent the goals of reading instruction, not just precursors. Multi-year studies solve the 
problem of early presentation of skills ordinarily taught later. If kindergartners are taught certain 
first grade reading skills, end of first grade or second grade measures should be able to determine 
if this early teaching was truly beneficial. Due to the unique nature of research on kindergarten- 
only programs, studies whose final posttesting took place before spring of first grade are 
reviewed in a separate section of this article. 

Review Methods 

As noted earlier, the review methods used here are adaptations of a technique called best- 
evidence synthesis (Slavin, 1986, 2008). Best-evidence syntheses seek to apply consistent, well- 
justified standards to identify unbiased, meaningful information from experimental studies, 
discussing each study in some detail, and pooling effect sizes across studies in substantively 
justified categories. The method is very similar to meta-analysis (Cooper, 1998; Lipsey & 
Wilson, 2001), adding an emphasis on narrative description of each study’s contribution. It is 
similar to the methods used by the What Works Clearinghouse (2009), with a few important 
exceptions noted in the following sections. See Slavin (2008) for an extended discussion and 
rationale for the procedures used in all of these reviews. 

Literature Search Procedures



A broad literature search was carried out in an attempt to locate every study that could 
possibly meet the inclusion requirements. Electronic searches were made of educational 
databases (JSTOR, ERIC, EBSCO, PsycINFO, Dissertation Abstracts) using various
combinations of key words (for example, “elementary students,” “reading,” “achievement”) and 
the years 1970-2009. Results were then narrowed by subject area (for example, “reading 
intervention,” “educational software,” “academic achievement,” “instructional strategies”). In 
addition to looking for studies by key terms and subject area, we conducted searches by program 
name. Web-based repositories and education publishers’ websites were also examined. We 
attempted to contact producers and developers of reading programs to check whether they knew 
of studies that we had missed. Citations were obtained from other reviews of reading programs 
including the What Works Clearinghouse (2009) beginning reading topic report, Adams (1990), 
National Reading Panel (2000), Snow, Burns & Griffin (1998), Torgerson, Brooks, & Hall 
(2006), Rose (2006), and August & Shanahan (2006), or potentially related topics such as 
instructional technology (E. Chambers, 2003; Kulik, 2003; Murphy et al., 2002). We also 
conducted searches of recent tables of contents of key journals. We searched the following 
tables of contents from 2000 to 2009: American Educational Research Journal, Reading 
Research Quarterly, Journal of Educational Research, Journal of Educational Psychology, 
Reading and Writing Quarterly, British Educational Research Journal, and Learning and 
Instruction. Citations of studies appearing in the studies found in the first wave were also 
followed up. 

In general, effect sizes were computed as the difference between experimental and
control individual student posttests after adjustment for pretests and other covariates, divided by 
the unadjusted posttest control group standard deviation. If the control group SD was not 
available, a pooled SD was used. Procedures described by Lipsey & Wilson (2001) and 
Sedlmeier & Gigerenzer (1989) were used to estimate effect sizes when unadjusted standard
deviations were not available, as when the only standard deviation presented was already 
adjusted for covariates or when only gain score SD’s were available. If pretest and posttest 
means and SD’s were presented but adjusted means were not, effect sizes for pretests were 
subtracted from effect sizes for posttests. In multiyear studies, effect sizes may be reported for 
each year but only the final year of treatment is presented in the tables. However, if there are 
multiple cohorts (e.g., K-1, K-2, K-3), each with adequate pretests, all cohorts are included in the
tables. 
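
The basic computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually coded by the authors: the function names, argument names, and example values are hypothetical, and the pre-post variant assumes that each effect size is divided by the standard deviation reported at its own time point.

```python
def effect_size(adj_mean_exp, adj_mean_ctrl, control_sd=None, pooled_sd=None):
    """Adjusted posttest difference divided by an unadjusted SD.

    The control-group SD is preferred; a pooled SD is the fallback
    when the control-group SD is not reported.
    """
    sd = control_sd if control_sd is not None else pooled_sd
    if sd is None or sd <= 0:
        raise ValueError("a positive control-group or pooled SD is required")
    return (adj_mean_exp - adj_mean_ctrl) / sd


def pre_post_effect_size(pre_exp, pre_ctrl, pre_sd, post_exp, post_ctrl, post_sd):
    """Fallback when adjusted means are not reported:
    posttest effect size minus pretest effect size.
    (Assumption: each effect size uses the SD from its own time point.)
    """
    return effect_size(post_exp, post_ctrl, post_sd) - effect_size(pre_exp, pre_ctrl, pre_sd)


# Hypothetical numbers, for illustration only.
es = effect_size(52.0, 50.0, control_sd=10.0)
es_diff = pre_post_effect_size(40.0, 39.0, 10.0, 52.0, 50.0, 10.0)
print(round(es, 2), round(es_diff, 2))
```

In the second call, the experimental group starts one tenth of a standard deviation ahead at pretest, so that head start is subtracted from the posttest advantage.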

Effect sizes were pooled across studies for each program and for various categories of 
programs. This pooling used means weighted by the final sample sizes. The reason for using 
weighted means is to maximize the importance of large studies, as the previous reviews and 
many others have found that small studies tend to overstate effect sizes (see Rothstein et al.,
2005; Slavin & Smith, in press). 
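
A short sketch shows how weighting by sample size limits the influence of a small study with a large effect size. The function name and the two studies are hypothetical and do not correspond to any study in the tables.

```python
def weighted_mean_effect_size(studies):
    """Mean effect size weighted by final sample size.

    `studies` is an iterable of (effect_size, final_sample_size) pairs.
    """
    studies = list(studies)
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(es * n for es, n in studies) / total_n


# Hypothetical example: one small study with a large effect,
# one large study with a small effect.
hypothetical = [(0.60, 50), (0.10, 950)]
weighted = weighted_mean_effect_size(hypothetical)
unweighted = sum(es for es, _ in hypothetical) / len(hypothetical)
print(round(weighted, 3), round(unweighted, 3))
```

With these made-up values the unweighted mean is pulled far toward the small study, while the weighted mean stays close to the result of the large study.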

Effect sizes were broken down for measures of decoding (e.g., word attack, letter-word 
identification, and fluency), vocabulary, and comprehension/total reading. In general, 
comprehension, which is the ultimate goal of reading instruction, is the most important outcome 
measure. Very few studies reported separate vocabulary scores, so the tables only show separate 
outcomes for decoding and comprehension (although vocabulary measures are included in
totals). 

Criteria for Inclusion 

Criteria for inclusion of studies in this review were as follows. 

1. The studies evaluated initial (i.e., non-remedial) classroom programs for elementary 
reading. Studies of variables, such as use of ability grouping, block scheduling, or single- 
sex classrooms, were not reviewed. Studies of tutoring and remedial programs for 
struggling readers are reviewed in a separate article (Slavin et al., 2009).

2. The studies involved interventions that began when children were in elementary school, 
grades K-5. As noted earlier, studies that began and ended in kindergarten are reviewed 
separately. Programs beginning in K or 1 were categorized as beginning reading, while 
those beginning in 2-5 were categorized as upper elementary. 

3. The studies compared children taught in classes using a given reading program to those in 
control classes using an alternative program or standard methods. 

4. Studies could have taken place in any country, but the report had to be available in 
English. 

5. Random assignment or matching with appropriate adjustments for any pretest differences 
(e.g., analyses of covariance) had to be used. Studies without control groups, such as pre- 
post comparisons and comparisons to “expected” scores, were excluded. 

6. Pretest data had to be provided, unless studies used random assignment of at least 30 
units (individuals, classes, or schools) and there were no indications of initial inequality. 
Studies with pretest differences of more than 50% of a standard deviation were excluded
because, even with analyses of covariance, large pretest differences cannot be adequately
controlled for as underlying distributions may be fundamentally different (Shadish, Cook,
& Campbell, 2002).

The Best Evidence Encyclopedia is a free web site created by the Johns Hopkins University School of Education’s Center for Data-Driven
Reform in Education (CDDRE) under funding from the Institute of Education Sciences, U.S. Department of Education.

7. The dependent measures included quantitative measures of reading performance, such as 
standardized reading measures. Experimenter-made measures were accepted if they were 
comprehensive measures of reading, which would be fair to the control groups, but 
measures of reading objectives inherent to the experimental program (but unlikely to be 
emphasized in control groups) were excluded. Studies using measures inherent to 
treatments, usually made by the experimenter or program developer, have been found to 
be associated with much larger effect sizes than are measures that are independent of 
treatments (Slavin & Madden, in press), and for this reason, effect sizes from treatment- 
inherent measures were excluded. The exclusion of measures inherent to the experimental 
treatment is a key difference between the procedures used in the present review and those 
used by the What Works Clearinghouse (2009). Measures of reading individually 
administered by the children’s own teachers were also excluded, on the basis that such
assessments are susceptible to bias. As noted above, measures of pre-reading skills such 
as phonological awareness, as well as related skills such as oral vocabulary, language 
arts, and spelling, were not included in this review. 

8. A minimum study duration of 12 weeks was required. This requirement is intended to 
focus the review on practical programs intended for use for the whole year, rather than 
brief investigations. Study duration is measured from the beginning of the treatments to 
posttest, so, for example, an intensive 8-week intervention in the fall of first grade would 
be considered a year-long study if the posttest were given in May. The 12-week criterion
has been consistently used in all of the systematic reviews done previously by the current
authors. This is another difference between the current review and the What Works 
Clearinghouse (2009) beginning reading topic report, which included very brief studies. 

9. Studies had to have at least 15 students and two teachers in each treatment group. 
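
The quantitative parts of these criteria can be expressed as a simple screening function. The sketch below is a simplified illustration, not the authors' screening instrument: all field names are hypothetical, and judgment-based criteria (for example, whether a program is non-remedial, or whether there are "indications of initial inequality") are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Study:
    # All field names are hypothetical, chosen for this sketch.
    has_control_group: bool
    randomized: bool
    units_assigned: int                # individuals, classes, or schools
    pretest_diff_sd: Optional[float]   # absolute pretest difference in SD units; None if no pretest
    duration_weeks: float              # start of treatment to posttest
    students_per_group: int            # smallest treatment group
    teachers_per_group: int            # smallest treatment group
    treatment_inherent_measure: bool


def exclusion_reasons(s: Study) -> List[str]:
    """Return the reasons a study fails the screen (empty list = included)."""
    reasons = []
    if not s.has_control_group:
        reasons.append("no control group")
    if s.pretest_diff_sd is None:
        # Pretests may be missing only with random assignment of 30+ units.
        if not (s.randomized and s.units_assigned >= 30):
            reasons.append("no pretest data")
    elif s.pretest_diff_sd > 0.5:
        reasons.append("pretest difference exceeds 50% of an SD")
    if s.duration_weeks < 12:
        reasons.append("duration under 12 weeks")
    if s.students_per_group < 15 or s.teachers_per_group < 2:
        reasons.append("fewer than 15 students or 2 teachers per group")
    if s.treatment_inherent_measure:
        reasons.append("measure inherent to the treatment")
    return reasons


# Hypothetical study: an 8-week fall intervention with a May posttest,
# so duration counts from the start of treatment to the posttest.
example = Study(True, True, 40, None, 32, 60, 4, False)
print(exclusion_reasons(example))
```

Returning the list of reasons rather than a single yes/no mirrors the practice, described later in the article, of recording why each excluded study was excluded.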

Limitations 

It is important to note several limitations of the current review. First, the review focuses 
on experimental studies using quantitative measures of reading. There is much to be learned 
from qualitative and correlational research that can add depth and insight to understanding the 
effects of reading programs, but this research is not reviewed here. Second, the review focuses 
on replicable programs used in realistic school settings expected to have an impact over periods 
of at least 12 weeks. This emphasis is consistent with the review’s purpose in providing 
educators with useful information about the strength of evidence supporting various practical 
programs, but it does not attend to shorter, more theoretically-driven studies that may also 
provide useful information, especially to researchers. Finally, the review focuses on traditional 
measures of reading performance, primarily individually-administered or group-administered 
standardized tests. These are useful in assessing the practical outcomes of various programs and 
are fair to control as well as experimental teachers, who are equally likely to be trying to help 
their students do well on these assessments. The review does not report on experimenter-made 
measures of content taught in the experimental group but not the control group, even though 
results on such measures may also be of importance to some researchers or educators. 

Categories of Research Design



Four categories of research designs were identified. Randomized experiments (R) were 
those in which students, classes, or schools were randomly assigned to treatments, and data 
analyses were at the level of random assignment. When schools or classes were randomly 
assigned but there were too few schools or classes to justify analysis at the level of random 
assignment, the study was categorized as a randomized quasi-experiment (RQE) (Slavin, 2008). 
Matched (M) studies were ones in which experimental and control groups were matched on key 
variables at pretest, before posttests were known, while matched post-hoc (MPH) studies were 
ones in which groups were matched retrospectively, after posttests were known. Studies using 
fully randomized designs (R) are preferable to randomized quasi-experiments (RQE), but all 
randomized experiments are less subject to bias than matched studies. Among matched designs, 
prospective designs (M) were preferred to post-hoc matched designs (MPH). In the text and in 
tables, studies of each type of program are listed in this order (R, RQE, M, MPH). Within these 
categories, studies with larger sample sizes are listed first. Therefore, studies discussed earlier in 
each section should be given greater weight than those listed later, all other things being equal. 
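
The listing order described above amounts to a two-key sort: design category first, then sample size in descending order. A minimal sketch follows; the study names and sample sizes are hypothetical.

```python
# Design categories in the order used in the text and tables.
DESIGN_ORDER = {"R": 0, "RQE": 1, "M": 2, "MPH": 3}


def listing_order(studies):
    """Sort by design category, then by sample size (largest first)."""
    return sorted(studies, key=lambda s: (DESIGN_ORDER[s["design"]], -s["n"]))


# Hypothetical studies, for illustration only.
studies = [
    {"name": "Study A", "design": "M", "n": 1200},
    {"name": "Study B", "design": "R", "n": 300},
    {"name": "Study C", "design": "RQE", "n": 500},
    {"name": "Study D", "design": "R", "n": 900},
    {"name": "Study E", "design": "MPH", "n": 2000},
]
for s in listing_order(studies):
    print(s["design"], s["n"], s["name"])
```

Note that the large post-hoc matched study is still listed last: design category takes precedence over sample size.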

For Additional Information

The following sections present summaries of findings and tables showing characteristics 
and findings of individual studies. Descriptions of individual studies have been withheld to meet 
the page limits of this journal, but can be seen in an online version at www.bestevidence.org. The 
web site presents reviews separately for beginning and upper-elementary reading. The web 
versions also include appendices listing all relevant studies excluded from the review and the 
reasons for exclusion, as well as overall ratings of the strength of the evidence supporting use of
individual programs. 



Beginning Reading 

From the first day of kindergarten to the last day of first grade, most children go through 
an extraordinary transformation as readers. If all goes well, children at the end of first grade 
know the sounds of all the letters and can form them into words, know the most common sight 
words, and can read and comprehend simple texts. The K-1 period is distinct from other stages of
reading development because during this stage, children are learning all the basic skills of
turning print into meaning. From second grade on, children build fluency, comprehension, and
vocabulary for reading ever more complex text in many genres, but the K-1 period is
qualitatively different in its focus on basic skills. The following sections summarize research on 
programs for beginning reading. 

Research on Beginning Reading Curricula 

The reading curricula category consists of textbooks for initial (non-remedial) reading 
instruction. Some professional development is typically provided with these textbooks, but far 
less than would be typical of instructional process approaches. 

Table 1 summarizes descriptions and outcomes of all studies of curriculum programs for 
beginning reading. 



TABLE 1 HERE 

Beginning reading curricula have been evaluated in seven studies, five of which used
randomized quasi-experiments. 

These studies evaluated three core basal reading programs, Open Court Reading, 
Reading Street, and Scholastic Phonics Readers with Literacy Place, plus three supplemental 
programs, the Open Court Phonics Kit, Phonics in Context, and Elements of Reading: Phonics 
and Phonemic Awareness. The sample size-weighted mean effect size across all seven was 
+0.12, with the four studies of core basal programs reporting a weighted mean effect size of
+0.11 and the three studies of supplementary programs a weighted mean of +0.12. Effect
sizes averaged +0.23 for decoding measures, but only +0.09 for comprehension/total reading 
measures. 

Research on Instructional Technology for Beginning Reading

The effectiveness of instructional technology (IT) has been extensively debated over the 
past 20 years, and there is a great deal of research on the topic. Kulik (2003) concluded that 
research did not support use of IT in elementary or secondary reading, although E. Chambers 
(2003) came to a somewhat more positive conclusion. 

Thirteen studies of instructional technology for beginning reading met the standards for 
the present review. These were divided into three categories. Supplemental technology 
programs, such as Waterford, WICAT, and Phonics-Based Reading, are programs that provide 
additional instruction at students’ assessed levels of need to supplement traditional classroom 
instruction. Mixed-method models, represented by Writing to Read, are methods that use 
computer-assisted instruction along with non-computer activities as students’ core reading
approach. Embedded multimedia, represented by Reading Reels, provides video content
embedded in teachers’ whole-class lessons. 

Descriptions and outcomes of all studies of instructional technology in beginning reading 
that met the inclusion criteria appear in Table 2. 



TABLE 2 HERE 



The weighted mean effect size for all technology approaches in beginning reading was 
only +0.09 across 13 studies. A large, randomized study by Dynarski et al. (2007) and 
Campuzano et al. (2009) found no impact of five current supplemental CAI models. This study’s 
findings greatly affected the weighted mean of nine studies of supplementary CAI, estimated at 
+0.08. The weighted mean effect size for decoding measures, also substantially affected by the 
Dynarski/Campuzano findings, was only +0.05, although comprehension/total reading effects 
(not measured in the Dynarski/Campuzano study) averaged +0.20. Large effect sizes were 
reported in small, matched studies of Waterford and WICAT. Reading Reels, which uses 
multimedia embedded in teachers’ class lessons, had modest positive effects in two large 
randomized experiments (weighted mean ES=+0.20). With these potentially promising 
exceptions, research on the use of technology in beginning reading instruction does not show 
positive achievement effects of the types of software that have been most commonly used. 

Research on Instructional Process Programs for Beginning Reading

Instructional process programs are methods that focus on providing teachers with 
extensive professional development to implement specific instructional methods. These fell into 
three categories. Cooperative learning programs (Slavin, 1995, 2009) use methods in which
students work in small groups to help one another master academic content. Phonological 
awareness training is an approach that gives teachers specific classroom strategies for building 
phonics and phonemic awareness skills. Phonics-focused professional development models, 
including Reading and Integrated Literacy Strategies (RAILS), Sing, Spell, Read, and Write, 
Ladders to Literacy, Early Reading Research, and Orton Gillingham, provide training to 
teachers to help them effectively incorporate phonics, phonemic awareness, and other elements 
in beginning reading lessons. Note that two comprehensive programs combining instructional 
process approaches with innovative curricula, Success for All and Direct Instruction, are 
reviewed in a separate section of this article. 

Descriptions and outcomes of all studies of instructional process programs meeting the 
inclusion criteria appear in Table 3. 



TABLE 3 HERE



Effects for instructional process programs were very positive. Across 17 studies, five of 
which were randomized quasi-experiments, the weighted mean effect size for instructional 
process approaches in beginning reading was +0.37. The mean was +0.47 for decoding measures 
and +0.30 for comprehension/total reading measures. In particular, positive effects were seen for
cooperative learning programs such as Peer-Assisted Learning Strategies (PALS) and Classwide 
Peer Tutoring (mean ES=+0.46), phonics-focused professional development programs such as 
Sing, Spell, Read, and Write, Early Reading Research, and RAILS (mean ES=+0.43), and 
teaching of phonological awareness to kindergartners (mean ES=+0.22 on tests at the end of first
or second grade). 



Research on Combined Curriculum and Instructional Process Approaches for Beginning 
Reading 

Evaluations of programs that provide complete curricula as well as extensive professional 
development in classroom instructional processes are summarized in Table 4. These consist of 
two programs, Success for All and Direct Instruction. 



TABLE 4 HERE 



Success for All (SFA) is a comprehensive school reform program designed to ensure
success in reading for children in high-poverty schools (Slavin, Madden, Chambers, & Haxby, 
2009). It provides schools with a K-5 reading curriculum that focuses on phonemic awareness, 
phonics, comprehension, and vocabulary development, beginning with phonetically-controlled 
mini-books in grades K-1. Cooperative learning is extensively used at all grade levels.
Struggling students, especially first graders, receive one-to-one tutoring. Extensive professional 
development and a full-time facilitator help teachers effectively apply all program elements. 
Across 23 studies involving more than 12,000 children, the weighted mean effect size for 
Success for All was +0.29. On decoding measures the overall mean was +0.33, and the mean was 
+0.27 for comprehension/total reading. 

Dating back to the 1960s, Direct Instruction (DI) is an approach to beginning reading
instruction that emphasizes a step-by-step approach to phonics, decodable texts that make use of
a unique initial teaching alphabet, and structured, scripted manuals for teachers. Across three
evaluations of Direct Instruction, the weighted mean effect size for beginning reading was +0.10.
However, it is important to note that in other reviews that examined effects of DI in all
elementary grades (not just K-1), this program has been rated as among the strongest in reading
outcomes (e.g., Herman, 1999; Borman et al., 2003; CSRQ, 2006).

Kindergarten-Only Studies 

As noted earlier, studies that take place only during kindergarten can pose serious 
methodological challenges. Because the goals of kindergarten instruction vary a great deal from 
place to place, and have changed dramatically over the past 30 years, it is always possible that 
any experimental-control difference on an end-of-kindergarten reading measure is simply due to 
the fact that the control group was not being taught to read. Even when reading is being taught, 
kindergarten classes can vary greatly in their emphasis on phonics, so measures of word attack 
and phonological awareness can be easily inflated by programs that focus on these skills earlier 
than the control treatment does. Still, it is useful to know about kindergarten-only studies, as they 
can provide initial indications of programs worth following through to first grade and beyond. 

Thirteen studies met the standards of the review but took place only during the 
kindergarten year. These are summarized in Table 5. 



TABLE 5 HERE 



The kindergarten-only studies generally support the conclusions of the studies that follow
children through first grade and beyond. It is important to note that many of the programs cited
in the main review, which tested children at the end of first grade, also reported very positive
outcomes during kindergarten. These are also programs with a strong emphasis on phonics
and/or cooperative learning, including Success for All (e.g., Jones et al., 1997) and the
phonological awareness training programs (e.g., Lundberg et al., 1988).

Overall Patterns of Outcomes: Beginning Reading 

Across all categories, there were 63 qualifying studies of beginning reading programs
that posttested children at the end of first grade or later. Nineteen of the studies used random
assignment (8 were fully randomized and 11 were randomized quasi-experiments). The sample
size-weighted mean effect size was +0.22. These studies, involving more than 22,000 children,
were identified from among more than 2,000 studies initially reviewed, and represent those that
used rigorous experimental procedures.
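
The idea of a sample size-weighted mean effect size can be illustrated with a minimal sketch. It assumes that each study contributes a single effect size weighted by its number of students. The three studies below and the function names are hypothetical, chosen only to show the arithmetic; they are not data from this review, and any further adjustments the review may have applied are not reproduced here.

```python
from typing import Sequence, Tuple

# Each record is (effect_size, n_students).
# ASSUMPTION: these values are invented for illustration only;
# they do not correspond to any study in this review.
HYPOTHETICAL_STUDIES: Sequence[Tuple[float, int]] = (
    (0.40, 100),
    (0.10, 300),
    (0.25, 200),
)


def weighted_mean_effect_size(studies: Sequence[Tuple[float, int]]) -> float:
    """Mean effect size with each study weighted by its sample size."""
    total_n = sum(n for _, n in studies)
    if total_n <= 0:
        raise ValueError("total sample size must be positive")
    return sum(es * n for es, n in studies) / total_n


def unweighted_mean_effect_size(studies: Sequence[Tuple[float, int]]) -> float:
    """Simple average of effect sizes, ignoring sample size."""
    if not studies:
        raise ValueError("at least one study is required")
    return sum(es for es, _ in studies) / len(studies)


if __name__ == "__main__":
    weighted = weighted_mean_effect_size(HYPOTHETICAL_STUDIES)
    unweighted = unweighted_mean_effect_size(HYPOTHETICAL_STUDIES)
    print(f"weighted mean ES:   {weighted:+.2f}")
    print(f"unweighted mean ES: {unweighted:+.2f}")
```

With these made-up inputs the weighted mean is +0.20, lower than the unweighted mean of +0.25, because the largest hypothetical study has the smallest effect. Weighting by sample size keeps small studies with large effects from dominating a summary figure.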

Overall effects were somewhat stronger for decoding measures (such as Woodcock Word 
Attack and Letter-Word Identification) than for measures of comprehension and total reading. 
Across all studies, the weighted mean effect size was +0.27 for decoding measures and +0.20 for 
comprehension/total reading. Comprehension measures were more likely to show positive effects 
in multiyear studies that followed children into second grade or beyond. 

There are several important patterns in the findings on beginning reading programs that
are worthy of note. First, this article finds that successful programs almost always provide
teachers with extensive professional development and follow-up focused on specific teaching
methods. In particular, most of the beginning reading programs with strong evidence of
effectiveness have cooperative learning at their core: Success for All, Peer-Assisted Learning
Strategies, Reading Reels, and Classwide Peer Tutoring all emphasize children working with
other children on structured activities. These are all forms of cooperative learning in which
students work in small groups to help one another master reading skills, and in which the success
of the team depends on the individual learning of each team member. These are the elements that
previous reviewers (e.g., Rohrbeck et al., 2003; Slavin, 1995, 2009; Webb, 2008) have identified as
essential to the effectiveness of cooperative learning.

Second, all of the beginning reading programs found to be effective or promising in 
qualifying experiments have a strong focus on teaching phonics and phonemic awareness. This is 
particularly true of Success for All, PALS, Reading Reels, phonological awareness training, Open 
Court Phonics Kits, Scholastic Phonics Readers with Literacy Place, Early Reading Research, 
Reading and Integrated Literacy Strategies (RAILS), Direct Instruction, and Phonics-Based 
Reading. It is important to note that studies of all of these programs found positive effects on 
comprehension and/or total reading measures, not just decoding measures that would appear 
more slanted toward phonetic approaches. However, an emphasis on phonics did not guarantee 
positive effects. Phonetic curricular approaches and supplemental computer-assisted instruction 
models, in particular, had minimal impacts on student outcomes. A large-scale evaluation of 
phonics-focused professional development by Garet et al. (2008) similarly found minimal effects 
for second graders. It clearly matters a great deal how reading is taught, and an emphasis on 
phonics may be necessary but it is not sufficient to ensure meaningful reading gains. 

One key implication of the Gamse et al. (2008) evaluation of Reading First is that it is not
enough to encourage teachers to emphasize phonics, phonemic awareness, and other elements.
The Moss et al. (2008) report that analyzed differences between Reading First and similar Title I
schools that did not receive Reading First funding found that Reading First teachers were in fact
spending more time teaching reading, and specifically more time on phonics, phonemic
awareness, fluency, vocabulary, and comprehension. The Reading First teachers were
significantly more likely to use basal textbooks that were revisions of traditional basals designed
primarily to increase the focus on phonics and phonemic awareness. In order of popularity in
Reading First schools, these were Harcourt Trophies (22.5% of RF, 15.0% of non-RF), Open
Court Reading (15.4% vs. 9.8%), Scott Foresman Reading (13.0% vs. 12.2%), and Houghton
Mifflin’s Nation’s Choice (10.7% vs. 2.5%). Yet none of these had ever been evaluated at the
beginning of Reading First, and only Open Court Reading has been adequately evaluated since
then, in a study that found modest impacts (ES=+0.17; Borman, Dowling, et al., 2007). If
adopting books with more phonics and spending a few more minutes each day on the five
elements recommended by the National Reading Panel (2000) were sufficient to improve
beginning reading performance, the Gamse et al. (2008) national evaluation would have found
significant positive effects. The research summarized in the present review points in a different
direction. It supports the use of well-developed programs that integrate curriculum, pedagogy,
and extensive professional development.

Upper Elementary Reading Programs 

From second to fifth grade, children go through a critical transformation as readers. Most
beginning second graders are able to decode, to recognize key sight words, to comprehend
simple texts, and to read with some degree of fluency. The tasks that lie ahead of them,
however, are qualitatively different from those they have navigated so far. They must consolidate
and extend their basic skills, to be sure, and they must become fluent, confident readers. But
most importantly, children in the upper elementary grades must become strategic comprehenders
of increasingly sophisticated text. They must build a vocabulary of words and concepts as well as
a vocabulary of cognitive and metacognitive approaches to texts. While decoding skills may
develop in a fairly step-by-step progression, the skills mastered in the upper elementary grades
emerge as children read in many genres and learn how to make sense of what they read, a less
straightforward process. Early decoding success is a key predictor of success in the upper
elementary grades and beyond (e.g., Juel, 1988), yet there are many children who are successful
decoders but poor comprehenders. This period is also distinct from the middle grades, when
reading is not typically taught as a separate subject but is subsumed in English or
language arts.

Because of the different objectives and requirements of the upper elementary grades, 
programs that are effective in building beginning reading skills are not necessarily optimal in the 
upper elementary grades, and vice versa. For this reason, in reviewing research on effective 
reading programs, it is important to review programs at each of these levels separately. This 
section focuses on studies of non-remedial classroom reading approaches that begin in grades 
2-5. 



Current Issues in Upper-Elementary Reading 

In recent years, reading in the upper elementary grades has taken on particular centrality 
because of the growing importance of test-based accountability. In the U.S., state accountability 
systems have long emphasized performance in grades 3-5 as the indicator of elementary school 
success, and in 2001, No Child Left Behind heightened this emphasis, requiring testing of 
reading and math in every grade from three to eight, and adding sanctions for schools not making 
adequate yearly progress. In England, Key Stage 2 assessments in reading and math in Year 6 
(age 11) are the main indicators of primary school success.

Despite the obvious importance of upper-elementary reading for policy and practice,
there has never been a review of research on effective programs at this grade level. The federal
What Works Clearinghouse (2009) has created a topic report on beginning reading programs,
and this synthesis included studies with students up to third grade. However, the WWC excluded
studies that included grades above 3 if they did not analyze data separately for grades above and
below third grade, and this excluded many upper-elementary studies that included grades 2-4,
3-5, and so on. At this writing, the WWC has not announced a plan to do an upper-elementary
reading review. Deshler, Palincsar, Biancarosa, & Nair (2007) published a major “research-based
guide to instructional programs and practices” for struggling adolescent readers. It contains brief
discussions of the research evidence supporting each of 48 widely-used programs, as well as lists
of articles for each, and many of the articles reported studies of grades 3-6. Yet Deshler et al.
(2007) did not attempt to synthesize or compare the evidence bases for the programs at any grade
level.

The review of research on upper-elementary reading programs summarized in this section
uses methods identical to those used in the beginning reading review, except that programs had
to have begun in grades 2-5. This synthesis groups upper elementary reading programs in three
categories, defined previously for beginning reading programs: reading curricula, instructional
technology, and instructional process programs. Reading curricula primarily encompass core
reading textbooks and curricula, such as Scott Foresman’s Reading Street, as well as
supplementary texts such as Scholastic’s Fluency Formula. Instructional technology (IT) refers
to programs that use technology to enhance reading achievement, especially computer-assisted
instruction (CAI). Instructional process programs are the most diverse. All programs in this
category rely primarily on professional development to give teachers effective strategies for
teaching reading. These include programs focusing on cooperative learning, classroom
motivation and management, and metacognitive strategies. Examples include Cooperative
Integrated Reading and Composition, Peer-Assisted Learning Strategies (PALS), Exemplary
Center for Reading Instruction (ECRI), and Consistency Management-Cooperative Discipline
(CMCD).

Research on Upper Elementary Reading Curricula

The reading curricula category includes 7 qualifying studies of core basal textbooks and 8 
studies of supplementary texts used as initial instruction with all students. Characteristics and 
findings of individual studies appear in Table 6. 



TABLE 6 HERE 



Both core and supplemental reading curricula for the upper-elementary grades have been
studied in high-quality evaluations. Among 15 studies, there were five randomized experiments
as well as four randomized quasi-experiments, involving more than 10,000 students. These
studies found few effects on student reading achievement. The weighted mean effect size for
core reading curricula was only +0.06, and for supplementary curricula it was +0.08, with an
overall weighted mean of +0.06. The mean for the randomized studies and randomized
quasi-experiments was +0.04. The only curriculum with promising effects was Open Court (average
ES = +0.18), but in both of the studies of this program teachers received far more professional
development than that usually provided, and in both studies Open Court was used for 2½ hours
per day while control students had 90 minutes of reading.

Research on Instructional Technology Programs for Upper Elementary Grades



Thirty-one studies of instructional technology for grades 2-6 met the standards for this 
review. These were divided into three categories. Supplemental CAI programs, such as 
Jostens/Compass Learning, Academy of Reading, LeapTrack, My Reading Coach, and 
CCC/Successmaker provided additional instruction at students’ assessed levels of need to 
supplement traditional classroom instruction. Computer-Managed Learning Systems included 
only Accelerated Reader. This program uses computers to assess students’ reading levels, assign 
reading materials at students’ levels, score tests on those readings, and chart students’ progress, 
but students do not work directly on the computer. Innovative Technology Applications included
Fast ForWord and Lightspan.

Descriptions and outcomes of all studies of instructional technology in upper elementary 
reading that met the inclusion criteria appear in Table 7. 



TABLE 7 HERE 



Among the 31 qualifying upper-elementary studies that evaluated various forms of
instructional technology, eight used random assignment to treatments. The studies involved a 
total of more than 10,000 students. Overall, the sample size-weighted mean effect size was very 
small (ES=+0.06). The randomized evaluations (n=8) had a weighted mean effect size of +0.05. 
These findings support Kulik’s (2003) conclusion that effects of computer-assisted instruction in 
reading are minimal. 

None of the three categories of instructional technology programs had convincing
positive effects. Across 25 studies of supplemental programs (such as Jostens and CCC), the
weighted mean effect size was +0.05. Two studies of Accelerated Reader had a mean effect size
of +0.06. Effect sizes were higher but samples were small in two studies of Fast ForWord,
which had a mean effect size of +0.21, and a small study of Lightspan had an effect size of
+0.42.

It is important to note that there is no trend toward more positive effects of IT in more
recent studies. Among 11 studies reported since 2000, the weighted mean effect size was only
+0.06, and the large, randomized study by Dynarski et al. (2007; Campuzano et al., 2009) found
no significant effects of use of a variety of modern software on the reading achievement of fourth
graders (ES=+0.02). Most of the IT studies involved use of computers as supplements to regular
classroom instruction, usually for about 30 minutes, one to three times a week. It may be that
more intensive uses of IT would produce more robust effects, and the study of My Reading
Coach, which provided computerized instruction 45 minutes every day and showed positive
effects (ES=+0.24) in a large randomized evaluation, is a hint in this direction. Another
promising use of technology is in integrated computer and non-computer instruction, as done in
Read 180, successfully evaluated in the middle grades (Slavin et al., 2008). However, the
evidence summarized here clearly indicates that the types of supplementary computer-assisted
instruction programs that have dominated the use of technology in education for thirty years are
not producing significant effects in upper-elementary reading. Many studies of IT are of high
quality and many of them involve large samples. It is difficult to imagine that such a large
number of studies would fail to detect a meaningful impact if it existed.

Research on Upper Elementary Instructional Process Programs



Instructional process programs are methods that focus on providing teachers with 
extensive professional development to implement specific instructional methods. In upper 
elementary reading, instructional process programs are quite diverse. Thirty-three studies, six of 
which used random assignment, evaluated a broad range of approaches. Cooperative learning 
programs (Slavin, 1995, 2009; Webb, 2008) use methods in which students work in small groups 
to help one another master academic content. 

Strategy instruction programs teach students cognitive and metacognitive skills such as 
summarization, graphic organizers, and prediction to help them comprehend text. Strategy 
instruction is often combined with other methods, especially cooperative learning and peer 
tutoring. Structured phonetic intervention programs are approaches emphasizing phonics, 
systematic instruction, and frequent assessment of student progress. Phonics-focused 
professional development programs are ones that teach teachers the NRP elements, especially 
phonics and phonemic awareness, mostly in workshops. Integrated language arts programs are 
less structured and less phonetic, and focus on integrating reading and writing, literature study, 
and pleasure in reading. Cross-age tutoring programs involve older children working with 
younger ones, and same-age tutoring involves having children take turns tutoring one another. 
Classroom management and motivation programs focus on building a positive learning 
environment. 

Descriptions and outcomes of all studies of upper elementary instructional process
programs meeting the inclusion criteria appear in Table 8.



TABLE 8 HERE



Both the methods and the findings of instructional process programs for upper-elementary
reading were quite diverse. Across 33 experimental-control comparisons, involving
more than 17,000 students, the weighted mean effect size was +0.21. These include four
randomized studies and two randomized quasi-experiments.

Ten of the studies evaluated two forms of cooperative learning. These had a weighted 
mean effect size of +0.21. All but one of the cooperative learning studies evaluated Cooperative 
Integrated Reading and Composition (CIRC), which involves students in well-structured 
cooperative groups within which they help each other master and apply metacognitive learning 
strategies. CIRC was the basis for middle school reading programs called Student Team Reading 
and The Reading Edge, which had a weighted mean effect size of +0.29 in four secondary 
studies. The consistent positive effects of this family of cooperative learning approaches support 
the idea that programs focusing on professional development in structured activities that engage 
children in discussions about reading, giving them opportunities to help each other learn and use 
metacognitive skills, may have particular promise for enhancing reading achievement from the 
second grade onward. Positive effects were also found for cross-age tutoring programs 
(ES=+0.26 in 4 studies) and for same-age tutoring (ES=+0.26 in 2 studies), reinforcing the 
conclusion that structuring interaction among students on reading strategies is an effective 
approach. Another promising category was programs emphasizing metacognitive strategy
instruction, such as Reciprocal Teaching and Thinking Maps, which had a weighted mean effect
size of +0.32 in 5 studies. In these programs, students were taught skills such as prediction,
summarization, and self-evaluation.

It is important to note that additional instructional process programs also showed positive 
effects, but because the studies evaluating these approaches involved small groups of struggling 
readers rather than students in general, these findings are reviewed by Slavin et al. (2009). These 
include DISTAR/Corrective Reading, PALS, and Empower Reading.

Overall Patterns of Outcomes: Upper Elementary Reading

Across all categories, there were 79 qualifying studies of upper-elementary school
reading programs involving a total of more than 32,000 students, of which 23 used random
assignment (16 were fully randomized and 7 were randomized quasi-experiments). The overall
sample size-weighted mean effect size was +0.13. The mean effect sizes of +0.06 for reading
curricula and +0.06 for technology contrast with a mean of +0.21 for instructional process
programs, such as cooperative learning and strategy instruction, reinforcing the findings of the
beginning reading review.



Outcomes for High Poverty Schools 

An important question for policy and practice is whether effects of various programs are
particularly strong or weak for students in high-poverty schools. To examine this question,
schools in each study were defined as ‘high-poverty’ if at least 50% of their students qualified
for free or reduced-price lunches, or if other information in the study (such as a description of
schools as serving high-poverty neighborhoods) indicated high poverty status. Forty-one of the
beginning reading studies and thirty-one of the upper-elementary studies involved high-poverty
schools, by this definition. At beginning and upper-elementary grade levels, outcomes were very
similar for high-poverty schools (mean ES=+0.15) and low-poverty schools (mean ES=+0.14). Among
the studies of reading curricula, weighted mean effect sizes were +0.07 (n=14) for high-poverty
schools and +0.09 (n=8) for low-poverty schools. For IT, the weighted mean effect sizes were
+0.08 (n=17) for high-poverty schools and +0.06 (n=26) for low-poverty schools. Among
studies of instructional process programs, including beginning reading programs that combine
instructional process and curriculum, the weighted mean effect sizes were +0.27 (n=45) for
high-poverty schools and +0.20 (n=31) for low-poverty schools.

As in the overall set of studies, the studies of high-poverty schools supported the 
observation that programs that provide extensive professional development to teachers in 
specific classroom strategies are most likely to make a difference in the achievement of students 
in high-poverty schools. From a policy perspective, what these findings imply is that proven 
models could be used effectively in any type of school, but in order to reduce gaps according to 
socioeconomic status, these programs should be particularly encouraged among high-poverty 
Title I schools. 



Overall Discussion 

The research reviewed in this article provides reason for optimism about the 
improvement of basic reading instruction in the elementary grades. Sixty-three studies of 
beginning reading programs and 79 studies of upper-elementary reading programs met stringent 
methodological requirements, and these studies provide support for many replicable approaches. 
More research on a larger set of programs is needed, of course, but the research that already
exists provides educators and policy makers with several robust approaches they could choose to
improve their students’ reading performance. Those programs have been shown to be effective in
high-poverty as well as less disadvantaged schools, so if the effective programs were
implemented with integrity by many schools serving disadvantaged students, this could
significantly reduce achievement gaps between middle class and lower class children. The
research also identified types of approaches that have not been successful in improving
elementary reading performance.

There are several important patterns in the findings that are worthy of note. First, for both 
beginning reading and upper-elementary reading, this article finds extensive evidence supporting 
forms of cooperative learning in which students work in small groups to help one another master 
reading skills, and in which the success of the team depends on the individual learning of each 
team member. In beginning reading, examples of cooperative learning included PALS, and 
cooperative learning is a key component of Success for All. In upper-elementary reading, the 
category is primarily represented by Cooperative Integrated Reading and Composition (CIRC). 
Positive effects for studies of cross-age and same-age tutoring at all grade levels also reinforce 
the value of engaging students in structured peer-to-peer interactions. The finding of positive 
effects of cooperative learning programs is consistent with the findings of reviews of secondary 
reading programs (Slavin, Cheung, Groff, & Lake, 2008) and elementary and secondary math 
programs (Slavin, Lake, & Groff, in press; Slavin & Lake, 2008). 

Also consistent with previous reviews is the finding that both alternative curricula and
instructional technology generally produced small effects on reading measures at all grade levels.
In particular, the evidence did not support the idea that simply introducing materials or training
with a strong emphasis on phonics will significantly improve reading outcomes. Effects of
adopting phonetic textbooks were very small, and a large study of LETRS, a professional
development program focused on phonics, also found disappointing results (Garet et al., 2008).
These findings suggest that while phonics appears necessary in reading instruction, adding a
phonics focus is not enough to increase reading achievement.

The findings of this review add to a growing body of evidence to the effect that what 
matters for student achievement are approaches that fundamentally change what teachers and 
students do together every day. These programs are characterized by extensive professional 
development in classroom strategies intended to maximize students’ participation and 
engagement, give them effective metacognitive strategies for comprehending text, and strengthen 
their phonics skills. As in earlier reviews, such strategies had outcomes that were clearly and 
consistently more positive than those found for curricula or IT alone. These positive effects were 
found equally for high-poverty and low-poverty schools, and they were found on comprehension 
as well as decoding measures. More research and development of reading programs for 
elementary students is clearly needed, but this review identifies several promising approaches 
that could be used today to help students succeed in reading in the elementary grades. 
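-0.33
(191E, 155C)
End of grade 2
-0.33

| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Size by Subgroups/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Blachman et al. (1999) | Matched (S) | 1 1/2 years: 11 weeks in K, 1 year in 1st grade | 4 schools (2E, 2C); 128 students (66E, 62C); one year follow-up: 106 students (58E, 48C) | K-1 | High-poverty schools in Syracuse, NY | Woodcock Word ID | -0.28 | | | |
| | | | | | | Decoding of Real Words | -0.64 | | | |
| | | | | | | Decoding of Non-Words | -0'4 | | | |
| | | | | | | 1 year follow-up | | -0.33 | - | -0.33 |
| | | | | | | Woodcock Word ID | -0.31 | | | |
| | | | | | | Decoding of Real Words | -0.34 | | | |
| | | | | | | Decoding of Non-Words | -0.36 | | | |
| Phonics-Focused Professional Development Models | | | | | | | | | | |
| Sing, Spell, Read, Write | | | | | | | | | | |
| Jones (1995) | Matched (S) | 7 months | 4 classes; 97 students (50E, 47C) | 1 | School in Appalachian Mississippi: 55% FL, 78% W, 22% AA | Gates MacGinitie Reading Comprehension | | -- | -0.21 | -0.21 |
| Early Reading Research (ERR) | | | | | | | | | | |
| Shapiro & Solity (2008) | Matched (S) | | 12 schools (6E, 6C); 434 students (235E, 199C) | K-1 | Schools in England | British Achievement Scales Word Reading; NFER | -0.62 | | | -0.54 |
| | | | | | | Word Reading | -0.52 | | | |
| | | | | | | Accuracy | -0.59 | | | |
| | | | | | | Comprehension | -0.41 | | | |
| Reading and Integrated Literacy Strategies (RAILS) | | | | | | | | | | |
| Stevens et al. (2008) | Matched (S) | 2 years | 3 schools (2E, 1C); 237 students (112E, 125C) | K-1; 1-2 | Schools in small city in PA; 71% FL, 94% W | MAT | | | -0.41 | -0.41 |
| | | | | | | | K-1: -0.39 | | | |
| | | | | | | | 1-2: -0.43 | | | |
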
| Study | Design (Large/Small) | Duration | N | Grade | Sample Characteristics | Posttest | Effect Size by Subgroups/Measure | Decoding | Comprehension | Overall Effect Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Ladders to Literacy | | | | | | | | | | |
| O'Connor (1999) | Matched (S) | 1 year | 4 schools (2E, 2C); 105 students (64E, 41C) | K-1 | Large urban district; 4d% AA, 51% W | Woodcock Letter-Word ID | -0.92 | -0.20 | - | -0.20 |
| | | | | | | Woodcock Letter-Word ID (1-year followup) | -0.02 | | | |
| | | | | | | Woodcock Word Attack (1-year followup) | -0.38 | | | |
| Orton-Gillingham | | | | | | | | | | |
| Joshi et al. (2002) | Matched (S) | 1 year | 4 schools; 56 students (24E, 32C) | 1 | High-poverty schools in the Southwest; 81% FL, 53% minority | Woodcock Word Attack | -0.28 | -0.28 | -0.58 | -0.43 |
| | | | | | | GMRT | -0.38 | | | |
| Other Professional Development Models | | | | | | | | | | |
| Four Blocks | | | | | | | | | | |
| Scarcelli & Morgan (1999) | Matched (S) | 1 year | 55 students (25E, 30C) in 4 classes (2C, 2E) | 1 | Title I school in Virginia Beach, VA | GMRT | | - | -0.56 | -0.56 |

Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; MAT=Metropolitan Achievement Test; TERA=Test of Early Reading Ability; TOWRE=Test of Word
Reading Efficiency; DORT=Durrell Oral Reading Test; GMRT=Gates-MacGinitie Reading Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic.

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Table 4: Curriculum + Instructional Process Programs in Beginning Reading 


Study 


Design 
Large Small 


Duration 


N 


Grade 


Sample 

Characteristics 


Posttest 


Effect Sizes by 
Subgroup 
Measure 


Decoding 


Comp rehenzion 


O era 11 
Effect 
Size 


Success for All 


Borman at al. (2007) 


Randomized (L) 


3 yaan 


35 schools 
210S students 
(10S5E, 1023 C) 


K-2 


Title I schools 
throughout die U.S.. 
72%FL, 57% AA, 
31% W, 10% H 


Woodcock 




i 

k> 

oo 


-0.21 


-0.25 


Word Identification 


-0.22 


Word Attack 


-0.33 


Passage C omprehension 


-0.21 


Corranti (2009) 


Matched (L) 


4 years 


115 schools 
(3QE, 85C) 
3783 students 
(831E2932C) 


K-3 


High poverty schools 
in 17 states. 

69% FL, 52% AA,
22% W, 19% H, 6%
Asian 


Terra Nova 








-0.43 


Madden et al. (1993);
Slavin et al. (1993)


Matched (L) 


5 years


10 schools 
(5 E, 5 C)
1925 students
(890 E, 1035 C)
5 cohorts
(1st grade in
experiment 1 year,
2nd grade 2 years,
etc.)


1-5 


African American
students in high-
poverty schools in
Baltimore, MD


Average of Woodcock,
DORT, and CTBS




-0.55 


-0.39 


-0.46 


1st grade 


-0.55 


2nd grade 


-0.32 


3rd grade 


-0.49 


CTBS 




4th grade 


-0.45 


5th grade


-0.48


Nunnery et al. (1996)


Matched (L)


2 years


64 schools 
(46E, 18C) 
1555 students 


1-2 


High-poverty schools 
in Houston, TX
79% FL, 52% H,
48% AA


Average of Woodcock 
and DORT 




-0.09 


-0.02 


-0.05 


First cohort (Gr. 2) 


-0.08 


Second cohort (Gr. 1)


-0.09 


Spanish (Gr. 1) 


-0.21 


Livingston & Flaherty (1997)


Matched (L) 


2 years 


6 schools 

(3 E, 3 C) 

3 cohorts: 
English speakers 

(272E, 184C)

Spanish bilingual 

(87 E, 93 C) 
Other ESL 
(80 E, 112 C)


1,2 


High-poverty
multilingual schools
in Modesto and
Riverside, CA


Average of Woodcock and 
DORT across cohorts




-0.49 


-0.49 


-0.49 


English-Dominant


-0.28 


Spanish Bilingual


-0.77 


ESL 


-0.43 


Ross et al. (1996) 


Matched (L) 


1 year 


4 schools 
(2E, 2C)
540 students 
(169 E, 371 C) 


1 


Mostly Hispanic
schools in

Amphitheater District
near Tucson, AZ


Average of Woodcock and 
DORT 




-0.62 


-0.33 


-0.47 


Jones et al. (1997)


Matched (L) 


3 years 


2 schools 
(1E, 1C)
498 students
(339E, 159C)
Cohort 1:
172 students
(113E, 59C)
Cohort 2:
157 students
(109E, 48C)
Cohort 3:
169 students
(117E, 52C)


3 Cohorts: 
Cohort 1: 
K-3 

Cohort 2: 
K-2 

Cohort 3: 
K-1


High-poverty AA
schools in Charleston,
SC


Woodcock 




-0.23


-0.02 


-0.27 


Kindergarten


-0.98 


Woodcock and DORT




1st grade 


-0.20 


SAT or BSAP




1st grade 


-0.03 


SAT 




2nd grade 


-0.10 


SAT 




3rd grade 


-0.06 


B. Chambers et al. (2005)


Matched (L) 


1 year 


8 schools
(4 E, 4 C)
455 students
(311 E, 144 C)


K-1


Mostly Hispanic
communities in the 

US 


Woodcock Reading
Mastery Test 




-0.20


-0.21 


-0.20 


Ross, Smith, & Casey (1992)


Matched (L) 


3 years 


2 schools 

(1E, 1C)
370 students 
(223E, 147C) 

3 cohorts 


1-3 


Rural schools in 
Caldwell, ID 


Average of Woodcock and 
DORT 




-0.10 


-0.11 


-0.10 


Ross & Casey (1998b)


Matched (L) 


2 years


8 schools 
(3E, 5C) 
356 students 
(151E, 205C) 


K-l 


High-poverty schools

in Ft. Wayne, IN;
75% FL, 45% minority


Woodcock 




-0.33


-0.17 


-0.25 


Word Identification


-0.22 


Word Attack 


-0.45 


Passage Comprehension


-0.14 


Durrell Oral 


-0.21 


Munoz & Dossett (2004) 


Matched (L) 


3 years 


6 schools 

(3 E, 3 C) 
349 students 
(217 E, 132 C) 


K-3 


High-poverty schools
in Louisville, KY 


CTBS 




- 


-0.15 


-0.15 


Dianda & Flaherty (1995)


Matched (L) 


2 years


6 schools (3E, 3C) 
319 students 
(131 E, 188 C)


1 


Mostly Hispanic 
students in schools in 
California 
72% FL, 42% H,
34% W,
32% ELL


Woodcock 




-0.41 


-0.45 


-0.42 


Letter- Word Identification 


-0.46 


Word Attack


-0.36 


Passage Comprehension 


-0.45 


Woodcock (all three 
measures) 




English speakers 


-0.55 


Spanish bilingual 


-0.84 


Spanish dominant 


-0.82 


Non-English speakers 


^0.11 


Ross & Casey (1998a)


Matched (L) 


1 year 


4 schools 

(2 E, 2 C) 
316 students 
(156 E, 160 C) 


1 


Suburban schools in 
Portland, OR 


Average of Woodcock and 
DORT 




0.00 


0.02 


0.01 


Ross, Smith, & Casey (1997)


Matched (L) 


2 years 


Cohort 1:
135 students
(94E, 41C)
Cohort 2:
146 students
(106E, 40C)


K-1

1-2 


High-poverty schools
in Clarke Co., GA 


Average of Woodcock and 
DORT 




-0.22


-0.08


-0.15 


1st grade 


-0.27 


2nd grade 


-0.03 


Ross et al. (1995)


Matched (L) 


3 years


2 schools 

3 cohorts 
251 students 

Cohort 1:
59E, 47C
Cohort 2:
54E, 20C
Cohort 3:
45E, 32C


K-4 


Title I schools in Ft.
Wayne, IN


Average of Woodcock and 
DORT 




-0.09 


0.09 


0.00 


2nd grade 


+0.10 


3rd grade 


-0.10 


4th grade


0.00 


Casey et al. (1994)


Matched (S) 


1 year


3 schools 

(2 E, 1 C)
189 students
(116 E, 73 C) 


1 


High-poverty African
American schools in
Memphis, TN


Woodcock 




-0.78 


-0.53 


-0.65 


Word Identification 


-0.52 


Word Attack 


-1.03 


Passage Comprehension 


-0.63 


Durrell Oral Reading 


-0.42 


Ross, Smith, & Bond (1994)


Matched (S)


2 years


Cohort 1: 

4 schools 
133 students 
(65E, 68C)
Cohort 2: 

2 schools 
46 students 
(20E, 26C)


K-1

1-2 


African American 
students in high- 
poverty schools in 
Montgomery, AL


Average of Woodcock and 
DORT 




-0.76 


-0.47 


-0.62 


K-l Cohort 


-0.39 


1-2 Cohort 


+1.15 


Smith et al. (1994) 


Matched (S) 


4 years 


2 schools 
142 students 
(74E, 68C)
4 cohorts 


1-4


High-poverty AA
school in Memphis


Average of Woodcock 
and DORT Gray 




-0.55 


-0.65 


-0.60 


1st grade


-1.15 


2nd grade


-0.08


3rd grade


-0.56 


4th grade


-0.04 


Wasik & Slavin (1993)


Matched (S) 


3 years 


2 schools 
(1E, 1C)

3 cohorts 


1-3 


High-poverty schools 
in Charleston, SC,
40% FL, 60% AA


Average of Woodcock and 
DORT 




-0.39


-0.39 


-0.39 


1st grade


I 

O 

i J 
o 


2nd grade


-0.67 


3rd grade


-0.30 


Slavin & Madden (1991)


Matched (S) 


2 years 


2 schools 

(1 E, 1 C)
108 students
(58 E, 50 C)


1-2 


Small rural town in
Maryland
40% FL, 50% AA,
50% W


Average of Woodcock and 

DORT 


-0.02 


-0.02 


-0.02 


-0.02 


CTBS 


-0.02 


Wang & Ross (1999a)


Matched (S) 


1 year


4 schools 
(2 E, 2 C)
97 students 
(50 E, 47 C) 


1 


High-poverty schools
in Little Rock, AR


Average of Woodcock and 
DORT 




-0.20


-0.39 


-0.30 


Wang & Ross (1999b)


Matched (S)


1 year 


2 schools 
(1 E, 1 C)
82 students
(43 E, 39 C)


1 


High-poverty mostly 
Hispanic schools in 
Alhambra District near
Phoenix, AZ


Average of Woodcock and 
DORT 




-0.15 


-0.16 


-0.15 


Slavin & Madden (1998)


Matched (S) 


3 years 


50 students 
(21E, 29C)


1-3 


Spanish-dominant
LEP students in 
Philadelphia, PA who 
had transitioned to 
English classes 


Woodcock 




-0.36


-0.07 


-0.22 


Word Attack 


-0.65 


Word Identification 


-0.06 


Passage Comprehension


-0.07 


Direct Instruction 


Kennedy (1978) 


Matched (L) 


4 years


2216 children 
(1161E, 1055C)


K-3 


High poverty schools 
ink RL i : & MS 


MAT Reading
Comprehension




- 


-0.07 


-0.07 


Mac Iver et al. (2003)


Matched (L) 


4 years


12 schools 
(6 E, 6 C)
275 students
(171 E, 104 C)


K-3 


High-poverty schools 
in Baltimore, majority
African-American 


CTBS 




- 


-0.13 


-0.07 


Reading Comprehension


-0.13 


Vocabulary


0.00 


Grant (1973) 


Matched Post 
Hoc (S) 


2 years 


2 schools 
78 students 
(39E, 39C)


K-1


High-poverty African 
American students in 
WI


Wisconsin Reading Skill
Development 




-0.84 


- 


-0.84 


Long Vowels


-0.64 


Base Words 


-1.33 


Dale Johnson Word
Recognition


-0.54 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; DORT=Durrell Oral Reading Test; CTBS=Comprehensive Test of Basic
Skills; SAT=Scholastic Achievement Test; BSAP=Basic Skills Assessment Program; MAT=Metropolitan Achievement Test; FL=Free/reduced-price lunch; W=White; AA=African American;
H=Hispanic; ELL=English language learner.
Table 5: Kindergarten-Only Studies


Study 


Design 

Large/Small


Duration 


N 


Grade 


Sample Characteristics 


Posttest 


Effect Sizes by 
Subgroup/Measure


Overall 
Effect Size 


Reading Curricula 


Superkids








43 classes 






SAT-10 






Borman & Dowling (2007) 


Matched (L)


1 year


(23E, 20C)




Schools throughout the


Sounds and Letters 


+0.25 




750 students 




U.S., 52% minority 


Word Reading 


+0.14 


+0.20 








(400E, 350C) 






Sentence Reading 


+0.22 










43 classes 
(21E, 22C) 
750 students 
(302E, 368C)






ITBS














Schools throughout the


Word Analysis 


+0.41 




D'Agostino (2009)


Matched (L)


1 year


K 


U.S., 47% FL, 


Reading Words 


+0.23 


+0.23 










38% minority


Reading Comprehension 


+0.24 












Vocabulary 


+0.02 




Voyager Universal Literacy


Frechtling et al. (2006)


Matched (L)


1 year


8 schools 
(4 E, 4 C)


K


African American
students in 8 urban 
schools 


Woodcock 




+0.67 


398 students




Word ID 


+0.21 








(202 E, 196 C) 




Word Attack 


+1.11 










3 schools 






Woodcock 














High-poverty schools in 


Word ID 


-0.10 




Hecht(2003) 


Matched (S)


5 months 


(1 E, 2 C) 


K 


Word Analysis 


+0.10 


-0.02 


213 students 
(101 E, 112 C)


Orlando 


DIBELS 














Nonsense Word 


-0.07 




Instructional Technology


Waterford Early Reading Program


Paterson et al. (2003) 


Matched (L)


1 year 


16 classes 
(8E, 8C)


K 


High-poverty community
in western New York


Clay Word Recognition Test 




0.00 


Tracey & Young (2006)


Matched (L)


1 year 


15 classes 
(8E, 7C)
265 children 
(151E, 114C) 


K 


High-minority
northeastern community


TERA-2 




+0.47 


The Literacy Center (LeapFrog)




Randomized 




6 schools 




High-poverty schools in
Las Vegas, 30% ELL


Gates-MacGinitie


+0.17 




RAC (2003) 


Quasi- 

Experiment (S) 


1 year 


258 students
(126E, 132C) 


K 


DIBELS


+0.12 


+0.14 


Destination Reading 








15 classes 
(8E, 7C)




High-poverty high-


DIBELS 


-0.56 




Barnett (2000) 


Matched (L)


1 year 


K 


minority community in


Clay Word Recognition Test


-0.47 


-0.53 










FL 


Dolch 


-0.56 




Writing to Read


Stevenson et al. (1988)


Matched (S) 


1 year 


241 students 
(86E, 155C)


K 


African American 
students in Washington, 
DC 


MAT Reading




+0.35 


Granick & Reid (1987)


Matched (S)


1 year 


2 schools 
73 students 
(37E, 36C) 


K 


High-poverty African 
American schools in 
Baltimore 


MAT 




+0.02 


Instructional Process Programs


Ladders to Literacy 
























8 schools
(4E, 4C)

404 students 
3 groups: 
Ladders only: 
11 teachers, 
136 students; 

Ladders + PALS: 
11 teachers, 
133 students; 

Control: 

11 teachers, 
135 students 






Ladders to Literacy Group
















End of kindergarten 
















Woodcock 
















Word Attack 


+0.17 














Word ID 


-0.25 














Followup to Fall of first grade 








Randomized

(L)


20 weeks. 




Title I and non-Title I


Word Attack 


+0.38 




Fuchs et al. (2001)


with a one- 


K 


kindergartens in 


Word ID 


+0.05 


+0.21 




year followup 




Nashville, TN


Ladders + PALS Group
















End of kindergarten 
















Word Attack 


+0.36 














Word ID 


+0.25 














Followup to Fall of first grade 
















Word Attack 


+0.41 
















Word ID 


+0.43 


O'Connor (1999)


Matched (L)


1 year 


17 classes 
(9E, 8C)
318 students
(192E, 89C)


K 


Rural midwestern
district, 100% White


Woodcock Johnson Letter Word 
ID 




+0.43 


Typical children 


+0.33 


At-risk children 


+0.68


Little Books 


Phillips et al. (1990)


Randomized 

Quasi- 

Experiment (L) 


1 year 


18 classes
309 students 


K 


Urban and rural schools 
in Newfoundland. 
Canada 


\ET 




+0.22


School -home 


+0.33 


School only 


+0.19 


Home only


+0.14 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; ITBS=Iowa Test of Basic Skills; SAT-10=Stanford Achievement Test; TERA=Test
of Early Reading Ability; MAT=Metropolitan Achievement Test; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; ELL=English language learner.
Table 6

Upper Elementary Reading Curricula


Study 


Design 
Large/Small


Duration 


N 


Grade 


Sample Characteristics


Posttest


Effect Size by
Subgroup/
Measure


Overall Effect Size


Core Basal Programs


Open Court 








5 schools 




High-poverty schools in


Terra Nova 






Borman, Dowling, & Schneck (2007)


Randomized (L) 


1 year 


33 teachers 


2-5 


ID, FL, NC, TX 


Comprehension 


-0.15 


-0.15 


(18E, 15C)


77% FL, 73% minority,


Composite


-0.15 








613 students 




il%ESL 


Vocabulary


-0.13 
















SAT-9 






Skindrud & Gersten (2006)


Matched (L) 


2 years 


434 students 
(292 E, 142 C)
Grade 3 cohort: 
642 students 
(350 E, 292 C)


2-3, 3-4 


High-poverty schools in
Sacramento 


Grade 2-3 cohort 


-0.30 


-0.20 










Grade 3-4 cohort 


-0.10 




Reading Street












3 middle class schools; 2 


Gates-MacGinitie






Wilkerson, Shannon, & Herman
(2006) 


Randomized (L)


1 year 


5 schools 
32 teachers 


2-3 


Title I high-poverty
schools, 54% FL, 57% W,


2nd grade 


-0.10 


-0.06 












25% AA, 11% H


3rd grade 


-0.01 




Wilkerson, Shannon, & Herman






40 teachers




4 schools nationwide,


Gates-MacGinitie






Randomized (L)


1 year 


793 students 


2-3 


86% W, 3%AA. 


2nd grade 


-0.14 


-0.04 


(2007) 


(409E, 3 SC) 




26% FL


3rd grade 


-0.06 




Houghton Mifflin Reading








10 schools 
(5E, 5C)

2 Cohorts: 
Cohort 1: 
5S6 students 
(22QE. 326C) 






ITBS












Cohort 1: 
Grades 2-3 
Cohort 2: 
Grade 3 




Cohort 1 










Cohort 1: 


Mostly AA schools in


Reading


-0.08 




Sxrartz & Johnson (2003) 


Matched (L) 


2 years 

Cohort 2: 




Vocabulary


-OSS 


-0.11 


94% FL 76% AA. 


Total 


-0.15 






1 year 


16% W. 9%H 


Cohort 2 










Cohort 2 




Reading


-0.04 










465 students






Vocabulary


-0.17 










(91E, 374C)






Total 


-0.07 




1 

I 


Comer. Greene. & Munroe (2004) 


Matched (L) 


1 year 


63 schools 
(18E.45C) 
12,832 students
(3,928 E, 8,904 C)


3-5 


High-poverty schools in
Philadelphia


Terra Nova 




-0.10 


Whole Language Basals


Rigby














Gates-MacGinitie


















Second Graders












4 schools 






Word Decoding


-0.22 












High-poverty schools,

80% FL. 57% AA 


Word Knowledge 


-0.07 




Wilkerson (2004) 


Matched (L) 


32 weeks 


(2 E, 2 C)


2 and 4 


Comprehension 


-0.23 


-0.26 


472 students 


29% H, 5% W


Total 


-0.03 








(245 E, 227 C) 




Fourth Graders 


















Vocabulary 


-0.61 
















Comprehension 


-0.33 
















Total 


-0.48




Supplementary Curricula


Schoolwide Enrichment Reading Model


Reis, Eckert, McCoach, Jacobs, &
Coyne (2008) 






31 teachers 
(17E, 14 C) 
544 students 
(306 E, 238 C)




2 middle-class schools in 
New England towns, 36%


Oral Reading Fluency 


-0.08 




Randomized (L) 


14 weeks


3-5 


FL, 64% W, 28% H, 3%
AA, 3% Asian, 18% 

LEP 


ITBS


-0.15 


-0.12 


Elements of Reading: Comprehension














Gates-MacGinitie


















Vocabulary


-0.21 










18 teachers
(10E, 8C)
413 students
(229E, 184C)




Schools in AZ, KY, VA, 


Comprehension 


-0.11 












and OR. 


Total 


-0.17 




Resendez, Sridharan, & Azin (2006)


Randomized (L) 




3 


69% FL, 36%W, 28% H, 


ERDA 




-0.09 










20% AA, 


Target Words in Context


-0.05 












6% Native American 


Narrative Passage Fluency


-0.03 
















Informational Passage Fluency


0.00 
















Reading Comprehension


-0.12 




Elements of Reading: Vocabulary








7 schools 
268 students

(147E, 121C) 




High-poverty schools in


Gates-MacGinitie






Apthorp (2005a) 


Randomized Quasi 


1 year


3 


AL and NY. 


Vocabulary


-0.21 


-0.10 


experiment (L) 


83% FL, 49% AA, 


Comprehension 


-0.10 










46% W, 10% LEP 


ERDA Sight Vocabulary


0.00 




Elements of Reading: Fluency








10 classes 
184 students
(97 E, 87 C) 




Majority White, high-


ERDA 








Randomized Quasi 
experiment (S) 






poverty Title I schools


Word Identification 


0.00 




Apthorp (2005b) 


1 year




74% FL, 82% W, 12% 


Narrative Passage Fluency


-0.15 


-0.10 








AA, 4% H, 8% LEP


Informational Passage Fluency


-0.18 
















Gates-MacGinitie Comprehension


-0.05 


Fluency Formula




Sivin-Kachala & Bialo (2005)


Randomized Quasi 
experiment (S) 


1 year 


12 classes 
128 students
(66E, 62C)


A 


Suburban districts in Long 
Island. NY. 

20% FL, 7% LEP


Woodcock Passage Comprehension




-0.24 


Jacob's Ladder




Stambaugh (2007)


Matched (S)


12 weeks 


2 schools 


3-5 


Rural high-poverty
schools in OH,
27% FL 


ITBS




-0.02 


Contextually-Based Vocabulary Instruction


Nelson & Stage (2007)


Randomized Quasi 
experiment (L) 


3 months 


16 classes 
(8E, 8C)
308 students
(159E, 149C)


3, 5


Schools in midwestern
district. 

32% FL, 70% W, 24% H,
24% LEP


Gates -MacGinitie 




-0.15 


Comprehension 


-0.27 


Vocabulary 


-0.03 


QuickReads


Huxley (2006)


Matched (S) 


12 weeks 


4 classes
(2E, 2C)
61 students
(35E, 26C)


3 


High-poverty suburban 
school. 

69% FL, 63% AA,
33% W


Gates -MacGinitie 




-0.24 


Comprehension 


-0.32 


Rate 


-0.30 


Accuracy 


-0.42 


TOWRE 




Sight Word


-0.13 


Decoding 


-0.12 



Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; SAT-9=Stanford Achievement Test 9th Edition; ITBS=Iowa Test of Basic Skills; ERDA=Early Reading
Diagnostic Assessment; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; LEP=Limited English Proficient.
Table 7

Upper Elementary Technology Programs


Study


Design
Large/Small


Duration


N


Grade


Sample Characteristics


Posttest


Effect Size by
Subgroup/
Measure


Overall Effect
Size


Supplemental CAI Programs


Academy of Reading 


















Campuzano et al. (2009)


Randomized (L) 


1 year 


41 teachers 
(22E, 19C) 
899 students
(495E, 404C) 


4 


Schools across the U.S. 
65%FL, 54%AA, 29%H, 
17%W 


SAT-10 




-0.01 


Leap Track 


















Campuzano, et al. (2009) 


Randomized (L) 


1 year 


55 teachers 
(29E,26C) 
1274 students 
(665E, 609C) 


4 


Schools across the U.S. 
61% FL, 57% AA, 33% W,
10%H 


SAT-10 




+0.09 




















Jostens (Earlier form of Compass Learning)


Alifrangis (1991)


Randomized (S) 


1 year 


12 classes 
(6 E, 6 C) 


4-6 


School at an army base near 
Washington, D.C. 37% 
minority. 


CTBS Reading 




+0.15 


4th grade 


+0.30 


5th grade 


+0.20 


6th grade 


-0.04 


Becker (1994) 


Randomized (S) 


1 year 


1 school 
187 students


2-5 


Inner city Baltimore
High poverty.


CAT 




+0.09 


Standish (1995)


Matched (S)


1 year 


2 schools 
139 students 
(56E, 83C)




Students in suburban DE


MAT 6 Reading
Comprehension 




+0.05 


Estep (1997) 


Matched post hoc (L)


4 years 


106 schools 

(53E, 53C) 


3 


Elementary schools in IN


ISTEP 






Reading Vocabulary


+0.03 


+0.03 


Reading Total 


+0.03 


Clariana (1994) 


Matched post hoc (S)


1 year 


85 students 
(47E, 38C)


3 


School in a predominantly 
White, rural area. 


CTBS 




+0.20 


Compass Learning 


Kadel Research Consulting 
(2006) 


Matched post hoc (S)


2 years 


138 students (69 
E, 224 C) 


4-5 


Garfield Heights, OH 
50% FL, 63% W, 24% H 
13% AA 


OAT 




+0.29 


1 year 


-0.10 


2 years 


+0.29 


CCC SuccessMaker


Campbell (2000) 


Matched (L) 


1 year 


13 schools 
(7E, 6C)
701 students 
(310E, 391C) 


4-5 


Middle class students in 
Etowah, AL 


SAT 






Comprehension 


-0.09 


-0.02 


Vocabulary


+0.04 


Ragosta (1983) 


Matched (L) 


3 years


6 schools 
(4E, 2C) 
Eight 1-year
cohorts
Three 2-year
cohorts
One 3-year
cohort


4-6 


High poverty schools in Los 
Angeles 


CTBS 






One year


+0.17 


Comprehension 


+0.23 


Vocabulary 


+0.25 


Two years


Comprehension 


-0.01 


Vocabulary 


+0.17 


Three years


Comprehension 


-0.24 


Vocabulary 


+0.58 


Saracho (1982) 


Matched (L) 


1 year


256 students 
(128E, 128C) 


3-6 


Spanish-speaking migrant 
students 


CTBS Reading 




-0.09 


3rd 


-0.04 


4th


-0.25 


5th


+0.16 


6th 


-0.17 


Classworks Gold


Whitaker (2005) 


Matched post hoc (S)


1 year


2 schools, 
218 students 


4,5 


Schools in rural Tennessee 
62% Low SES.


TCAP 




-0.14 


4th 


-0.10 


5th 


-0.19 


My Reading Coach


Vaughan, Serido, & 
Wilhelm (2006) 


Randomized (L) 


1 year


4 schools 
284 students 
(127E, 157C) 


2-4 


Predominately minority
students from 4 schools in 3 
states; 

27% ELLs, 36% AA, 36%
H, 22% W 


GRADE 




+0.24 


Vocabulary


+0.24 


Comprehension 


+0.22 


WICAT 


Miller (1997)


Matched post hoc (L)


3 years


30 schools 
(10E, 20C) 


3-5 


NYC Public Schools; 
Predominantly African
American and Hispanic,
17% ESL


DRP 




+0.02 


Clayton (1992) 


Matched post hoc (L)


1 year


5 schools 
(1E, 4C)
426 students
(181E, 245C)


2-5 


Schools in northwest SC 
46% FL, 59% W, 39% AA


CTBS 




-0.01 


Mys & Petrie (1988)


Matched post hoc (L) 


3 years


4 schools 
(1E, 3C)
257 students
(81E, 176C)


2-4


Schools in Dearborn, MI 


ITBS




-0.15 


Open Book to Literacy


Williams (2005) 


Matched (S)


1 year 


2 schools 
(1E, 1C)
127 students
(66E, 61C)


4 


High-poverty schools in 
Memphis; 

51% W, 24% H, 21% AA 


TORC 




+0.28 


Other Supplemental CAI


Becker (1994) 


Randomized (S) 


1 year 


9 classes 
199 students 


2-5 


Schools in inner city
Baltimore
50% FL, 99% AA


CAT 




+0.06 


Easterling (1982) 
(MicroSystem 80)


Randomized (S) 


4 months 


2 schools 
42 students 
(21E, 21C) 


5 


Schools in suburban school 
district 


CAT Reading Comprehension 




+0.05 


Schmidt (1991) 
(Wasatch ILS)


Matched (L)


1 year 


4 schools 
(2E, 2C)
1,224 students
(646E, 578C)


2-6 


Schools in Southern CA 
25% FL 


CTBS 




+0.04 


Cooperman (1985) 


Matched (L) 


1 year 


3 schools 
(1E, 2C)
470 students 
(204E, 266 C) 


2-4 


Students from 3 low to
middle class schools. 
86% W, 13% AA 


CAT 




-0.06 


Bryg (1984) 


Matched (S)


15 weeks 


9 teachers 
(5E, 4C)
152 students
(83E, 69C)


4 


Schools in Omaha, NE 


CAT Reading 
Comprehension 




+0.20 


Roth & Beck (1987)


Matched (S)


1 year 


6 classes 
(3E, 3C) 
108 students 
(59E, 49C)


4 


High-poverty low-achieving
urban schools 
100% AA 


Woodcock Word Attack 


+0.60 


+0.38 


CAT Vocabulary 


+0.53 


CAT Reading Comprehension 


0.00 


Coomes (1985) 


Matched (S)


1 year 


4 schools 
102 students 
(51E, 51C)


4 


Middle class schools in TX 
90% W 


CTBS 




+0.02 


Hoffman (1984) 


Matched (S)


1 year 


3 schools 
96 students 
(51E, 45C)


3 


Schools in suburban 
midwest 
11% minority 


Gates MacGinitie 




-0.07 


Comprehension 


-0.04 


Vocabulary


-0.10 


Levy (1985)


Matched post hoc (L)


1 year 


4 schools 
581 students 
(293E, 288C)


5 


Suburban NY school 
district 


SAT 




+0.19 


Computer-Managed Learning Systems


Accelerated Reader 














DRS 
















Low SES students in a 


Vocabulary 


+0.25 




Knox (1996) 


Randomized (S)


3 months 


77 students 


3-4 


southeastern state: 


Comprehension 


-0.13 


-0.03 


(40E, 37C) 


72% FL, 79% W, 13% AA,


SAT 














8% H


Vocabulary 


-0.07 
















Comprehension 


-0.17 




Yee (2007)


Matched (L) 


1 year


3 schools 
(1E, 2C)
2072 students
(612E, 1460C)


2-5 


Majority Hispanic schools 
in Los Angeles: 

92% FL, 79% H, 17% AA,
61% ELL 


CST 




+0.06 


Innovative Technology Applications


FastForWord 


Marion (2004)


Matched (L) 


1 year 


349 students 
(215E, 134C) 


5-6 


Schools in Appalachian TN 
52% FL, 100%W 


Terra Nova 




+0.25 








142 students 
(55E, 87C)




Middle class schools in 
Northwest OH 


Gates-MacGinitie






Scientific Learning (2006) 


Matched (S)


15 weeks 


5-6 


Comprehension 


+0.12 


+0.11 










Vocabulary


+0.11 




Lightspan 








101 students 
(50E, 51C) 




Schools in the Caesar 


SAT 






Birch (2002) 


Matched post hoc (S)


2 years


2,3 


Rodney School District in


Vocabulary 


+0.59 


+0.42 










DE


Comprehension 


+0.25 





Note: L=large study with at least 250 students; S=small study with less than 250 students; E=Experimental; C=Control; CTBS=Comprehensive Test of Basic Skills; CAT=California Achievement Test;
CST=California Standards Test; MAT=Metropolitan Achievement Test; ITBS=Iowa Test of Basic Skills; ISTEP=Indiana Statewide Testing for Educational Progress; OAT=Ohio Achievement Test;
TCAP=Tennessee Comprehensive Assessment Program; GRADE=Group Reading Assessment and Diagnostic Examination; DRP=Degrees of Reading Power; WRAT=Wide Range Achievement Test;
SAT=Scholastic Achievement Test; DRS=Diagnostic Reading Scales; FL=Free/reduced-price lunch; W=White; AA=African American; H=Hispanic; ELL=English language learners; LEP=Limited
English Proficient.
Table 8

Upper Elementary Instructional Process Programs


Study


Design
Large/Small


Duration


N


Grade


Sample Characteristics


Posttest


Effect Size by
Subgroup/Measure


Overall
Effect Size








Cooperative Learning


Cooperative Integrated Reading and Composition (CIRC)


Stevens & Slavin (1995a)


Matched (L) 


2 years 


7 schools 
(3E, 4C)

63 classes
(31E, 32C)
1299 students
(635E, 664C)


2-6 


Working-class suburb of 
Baltimore 
9% FL, 95% W


CAT 




+0.23 


Vocabulary 


-0.20 


Comprehension 


-0.26 


Stevens & Slavin (1995b) 


Matched (L) 


2 years 


5 schools 
(2E, 3C)

45 classes
(21E, 24C)
873 students
(411E, 462C)


2-6 


Suburban district in 
Maryland 
10% FL, 93% W


CAT 




+0.25 


Comprehension 


+0.28 


Vocabulary 


-0.21 


Jenkins et al. (1994) 


Matched (L) 


1 year 


2 schools 
860 students

(332 E, 528 C) 


1-6 


Mount Vernon, WA
36% FL


MAT 






Comprehension 


+0.09 


+0.18 


Vocabulary


+0.31 


Total 


+0.18 


Stevens, Madden, Slavin, &
Farnish (1987; Study 1)


Matched (L) 


12 weeks 


10 schools 
(6E,4C) 
21 classes
(11E, 10C) 


3-4 


Middle-class suburb of 
Baltimore 

4% FL, 84% W, 16% AA


CAT 




+0.18 


Comprehension 


+0.19


Vocabulary


+0.17 


Stevens, Madden, Slavin, &
Farnish (1987; Study 2)


Matched (L) 


6 months 


9 schools 
(4E, 5C) 
22 classes
(9E, 13C) 
450 students 


3-4


Middle-class suburb of 
Baltimore 

18% FL, 78% W, 22% AA


CAT 




+0.45 


Comprehension 


+0.35 


Vocabulary 


+0.11 


Total 


+0.23 


Durrell 


+0.54 


Bramlett (1994) 


Matched (L) 


1 year 


8 schools
(9 C, 9 E)

18 classes
392 students
(198E, 194C)


3 


Rural southern Ohio 


CAT 




+0.08 


Comprehension 


+0.10 


Total Reading 


+0.07 


Word Analysis 


+0.10 


Vocabulary 


+0.03 








Rapp (1991) 


Matched (S) 


1 year


2 schools 
(1 E, 1 C) 
88 students

(43 E, 45 C) 


3 


Working-class schools in
Lewistown, ID 


ITBS 




+0.14 


Comprehension 


+0.09 


Vocabulary 


+0.18 


Calderon, Hertz-Lazarowitz, &
Slavin (1998)


Matched (S) 


2 years 


7 schools 
(3E, 4C) 
Year 1: 
84 students
(51E, 33C)
Year 2: 
59 students 
(26E, 33 C) 


2 and 3 


Spanish-dominant students
transitioning to English in
high-poverty schools near
the Mexican border in
Texas.

79% H 


STAAS 2nd graders 


+0.30 


+0.87 


NAPT 3rd graders 




1 year 


+0.62 


2 years 


+0.87 


Skeans (1991) 


Matched post hoc 

(L) 


19 months


630 students 
(348 E, 282 C)


3 and 5 


Suburban district near 
Houston 


MAT: 3rd grade 




-0.03 


Vocabulary 


+0.20


Comprehension 


+0.08 


MAT: 5th grade 




Vocabulary 


-0.15 


Comprehension 


-0.24


Reader's Theater 


Carrick (2000)


Matched (S) 


IT weeks 


98 students

(53E, 45 C) 


5 


Urban New Jersey 
80% FL, 85%AA, 11%H 


Compared to control 




+0.29


Terra Nova


+0.22


Oral Reading 


+0.50 


Compared to paired 
reading 




Terra Nova 


+0.12 


Oral Reading 


+0.30 


Same- Age Tutoring Programs 


PALS 


Fuchs, Fuchs, Kazdan, & Allen 
(1999) 


Randomized quasi- 
experiment (S) 


21 weeks


45 students 
15 students each in 
PALS, PALS-HG (PALS 
+ tutoring strategies), and 
control 


2-3 


Students in a southeastern 
city 

24% FL, 62% W, 38% AA 


SDRT Reading 
Comprehension 




+0.36 


PALS 


+0.72 


PALS-HG 


0.00 








Same-Age Tutoring + Strategy Instruction 


Van Keer & Verhaeghe (2005) 


Matched (L) 


1 year 


Second graders: 
11 classes 
(5E, 6C) 
215 students 
(91E, 124C) 
Fifth graders: 
10 classes 
(4E, 6C) 
208 students 
(101E, 107C) 


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading 
Comprehension Test 




+0.29 


2nd graders 


+0.17 


5th graders 


+0.40 


Van Keer & Verhaeghe (2008) 


Matched (L) 


1 year 


Second graders: 
12 classes 
(6E, 6C) 
234 students 
(110E, 124C) 
Fifth graders: 
15 classes 
(9E, 6C) 
293 students 
(186E, 107C) 


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading 
Comprehension Test 




+0.24 


2nd graders 


+0.26 


5th graders 


+0.21 


Cross-Age Tutoring Programs 


Reading Together 


Policy Studies Associates (2007) 


Randomized (S) 


1 year 


124 students 
(56E, 68C) 


2 


School in Irving, TX 


Terra Nova 




-0.01 


Cross-Age Tutoring 


Hilger (2000) 


Matched (S) 


1 year 


1 school 
72 students 

(47 E, 35 C) 


3 


High-poverty school, 
78% FL, 34% AA, 34% 
Asian, 26% W, 5% H 


STAR 


+0.16 


+0.37 


Fluency 


+0.58 








Cross-Age Tutoring + Strategy 


















Van Keer & Verhaeghe (2005) 


Matched (L) 


1 year 


Second graders: 

9 classes 
(3E, 6C) 

190 students 
(66 E, 124C) 
Fifth graders: 

10 classes 
(4E, 6C) 

276 students 
(169E, 107C) 


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading 
Comprehension Test 




+0.27 


2nd graders 


+0.22 


5th graders 


+0.32 


Van Keer & Verhaeghe (2008) 


Matched (L) 


1 year 


Second graders: 
14 classes 
(8E, 6C) 
286 students 
(162E, 124C) 
Fifth graders: 
13 classes 
(7E, 6C) 
263 students 
(156E, 107C) 


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading 
Comprehension Test 




+0.35 


Second graders 


+0.42 


Fifth graders 


+0.28 


Strategy Instruction 


Belgian Strategy Model 


Van Keer & Verhaeghe (2005) 


Matched (L) 


1 year 


Second graders: 

14 classes 
(8E, 6C) 
287 students 
(163E, 124C) 
Fifth graders: 
14 classes 
(8E, 6C) 
284 students 
(177E, 107C) 


2,5 


Middle class schools in 
Flanders, Belgium 


Dutch Reading 
Comprehension Test 




+0.30 


Second graders 


+0.24 


Fifth graders 


+0.35 








Thinking Maps 


Leary (1999) 


Matched (S) 


1 year 


2 schools 
(1E, 1C) 
78 students 
(41E, 37C) 


4 


High-poverty schools in 
southeastern VA 
79% FL, 69% AA, 31% W 


SAT-9 




+0.31 


Hickie (2006) 


Matched post hoc 

(S) 


2 years 


2 schools 
(1E, 1C) 
54 students 
(24E, 30 C) 


4-5 


High-poverty white schools 
in northeastern TN 
91% FL 


TCAP 




+0.70 


Foundations and Frameworks 


Blackmon (2008) 


Matched (S) 


1 year 


5 schools 
(3E, 2C) 
103 students 
(52E, 51C) 


4-5 


Philadelphia Christian 
schools; 

predominantly AA H 


Gates-MacGinitie 




-0.02 


Comprehension 


-0.08 


Vocabulary 


+0.04 


Reciprocal Teaching 


Spörer, Brunstein, & Kieschke 

(2007) 


Matched (S) 


19 weeks 


105 students 


3-6 


Middle-class schools in 
Germany 


German standardized 
comprehension test 




+0.57 


Fluency Instruction 


FORI 


Kuhn et al. (2006) 


Randomized 
quasi-experiment 

(S) 


1 year 


5 schools 
(3E, 2C) 
227 students 
(143E, 84C) 


2 


High-poverty schools in NJ 
and GA 

58% FL, 51% AA, 23% W, 
21% H, 5% Asian 


TOWRE 


+0.29 


+0.19 


GORT-4 


+0.10 


WIAT 


+0.18 


Structured Phonetic Intervention Programs 


Exemplary Center for Reading Instruction (ECRI) 


Reid (1996) 


Matched post hoc 

(L) 


1 year 


5 schools 
(4E, 1C) 
921 students 
(590E, 331C) 


2-6 


High-poverty 
schools in eastern TN 
99% W 


SAT 




+0.65 


Comprehension 


+0.71 


Vocabulary 


+0.59 


Cohen (1991) 


Matched post hoc 

(L) 


1 year 


473 students 

(242E, 231C) 


3 


Urban school district 
45% AA, 34% W, 21% H 


ITBS 




+0.14 


Comprehension 


+0.07 


Vocabulary 


+0.21 








Phonics-Based Professional Development 


Language Essentials for Teachers of Reading and Spelling (LETRS) 


Garet et al. (2008) 


Randomized (L) 


1 year 


90 schools 
5530 students 
(1983 LETRS, 
1738 LETRS + 
Coaching, 
1809 C) 


2 


6 urban districts 

78% FL, 78% AA, 15% W, 
5% H 


Various state 
assessments 




+0.06 


LETRS 


+0.08 


LETRS + Coaching 


+0.03 


Integrated Language Arts Programs 


Literature-Based Program 




Morrow (1992) 


Randomized quasi- 
experiment (S) 


1 year 


9 classes 
166 students 
(56 LBP + parents, 
46 LBP only, 
64 C) 


2 


Students in two suburban 
schools in NJ 

24% FL, 43% AA, 37% W, 
14% Asian 


CAT 




+0.21 


School - home 


+0.21 


School only - 


+0.20 


Success in Reading and Writing 




Lindsey (1988) 


Matched (S) 


1 year 


2 schools 
(1E, 1C) 
97 students 
(56E, 41C) 


2-3 


Elementary schools in the 
Pacific Northwest 


CAT 




-0.11 


Comprehension 


-0.23 


Vocabulary 


+0.01 


Carbo Reading Styles 




Oglesby & Suter (1995) 


Matched (S) 


1 year 


13 classes 
(6 E, 7 C) 
198 students 
(105 E, 93 C) 


3 and 6 


Urban school in the mid-south 

80% AA, 20% W, 81% 
remedial 


Gates-MacGinitie 




+0.27 


Classroom Management and Motivation Programs 


Consistency Management & Cooperative Discipline (CMCD) 


Freiberg, Prokosch, Treiser, & 
Stein (1990) 


Matched post hoc 

(L) 


2 years 


10 schools 
(5E, 5C) 
699 students 
(364E, 335C) 


2-5 


High-poverty schools in 
Houston 

72% FL, 90% AA 


MAT-6 
(grades 2-5) 


+0.09 


+0.12 


TEAMS 
(grades 3 and 5) 


+0.14 








Opuni (2006) 


Matched post hoc 

(L) 


1 year 


14 schools 
(7E, 7C) 
456 students 
(228E, 228C) 


3 


High-poverty schools in 
Newark, NJ 
78% FL, 90% AA 


SAT-9 




+0.26 


Student Success Skills 


Campbell and Brigman (2005) 


Randomized (L) 


6 months 


20 schools 
480 students 
(240E, 240C) 


5-6 


Low-achieving students in 
Florida 

62% FL, 82% W, 9% AA, 
5% H 


FCAT 




+0.23 


Responsive Classroom 


Rimm-Kaufman, Fan, Chiu, & 

You (2007) 


Matched post hoc 

(L) 


3 years 


6 schools 
(3E, 3C) 

3 groups: 
grades 2-5 
381 students 
(211E, 170C) 
grades 3-5 
502 students 
(282E, 220C) 
grades 4-5 
506 students 
(266E, 240C) 


2-5 


Schools in a northeastern 
urban district, 

35% FL, 57% W, 22% AA, 
21% H 


DRP 




+0.15 


Grades 2-5 


+0.21 


Grades 3-5 


+0.16 


Grades 4-5 


+0.07 



Note: L=large study with at least 250 students; S=small study with fewer than 250 students; E=Experimental; C=Control; CAT=California Achievement Test; MAT=Metropolitan Achievement 
Test; ITBS=Iowa Tests of Basic Skills; STAAS=Texas Assessment of Academic Skills-Spanish; NAPT=Norm-Referenced Assessment Program for Texas; SDRT=Stanford Diagnostic 
Reading Test; SAT=Stanford Achievement Test; TCAP=Tennessee Comprehensive Assessment Program; PALS=Peer-Assisted Learning Strategies; PALS-HG=Peer-Assisted Learning 
Strategies with Help-Giving Training; TOWRE=Test of Word Reading Efficiency; GORT=Gray Oral Reading Test; GRADE=Group Reading Assessment and Diagnostic Examination; 
STAR=Standardized Test for Assessment of Reading; WIAT=Wechsler Individual Achievement Test; TEAMS=Texas State Assessment of Academic Skills; SAT=Scholastic Achievement 
Test; DRP=Degrees of Reading Power; FCAT=Florida Comprehensive Assessment Test; FL=Free/Reduced lunch; W=White; AA=African American; H=Hispanic; CTBS=Comprehensive 
Test of Basic Skills. 






The Best Evidence Encyclopedia is a free web site created by the Johns Hopkins University School of Education's Center for Data-Driven Reform in Education (CDDRE) under funding from the 
Institute of Education Sciences, U.S. Department of Education. 



